A fine-grained data set and analysis of tangling in bug fixing commits

Abstract

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files, this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label, leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
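The consensus rule stated in the abstract (four labels per line, consensus when at least three agree) can be illustrated with a minimal sketch. The function name `consensus_label`, the parameter `min_agree`, and the label strings are hypothetical and chosen only for illustration; they are not taken from the study's tooling.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agree: int = 3) -> Optional[str]:
    """Return the label chosen by at least `min_agree` participants, else None.

    Assumption: each line of a commit receives one label per participant
    (four in the study). The label names used below are made up.
    """
    if not labels:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


# Three of four participants agree, so there is consensus.
agreed = consensus_label(["bugfix", "bugfix", "bugfix", "test"])

# A two-versus-two split does not reach the threshold.
split = consensus_label(["bugfix", "bugfix", "test", "test"])
```

With four labels and a threshold of three, at most one label can reach the threshold, so the result is unambiguous whenever it is not `None`.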

Similar resources

A Contrastive Analysis of Concord and Head Parameter in English and Azerbaijani

This thesis examines and compares two topics in English and Azerbaijani: subject-verb agreement (in person and number) and the head of the phrase. First, the grammatical relation of agreement is examined. Agreement means that a singular verb occurs with a singular subject and a plural verb with a plural subject. In English, all verbs except the verb "to be" show number agreement with their subject only in the third person singular of the present tense...

A Fine-grained Analysis of a Simple Independent Set Algorithm

We present a simple exact algorithm for the INDEPENDENT SET problem with a runtime bounded by O(1.2132^n poly(n)). This bound is obtained by, firstly, applying a new branching rule and, secondly, by a distinct, computer-aided case analysis. The new branching rule uses the concept of satellites and has previously only been used in an algorithm for sparse graphs. The computer-aided case analysis al...

Software Plans A Multidimensional Approach for Fine-Grained Tangling of Concerns in Code

To date, research in separation of concerns has focused on the development of language abstractions having a syntax for encapsulating concerns and a semantics for automatically integrating them. These mechanisms are effective at separating concerns at the granularity of the abstraction but fail at a finer level of granularity. In this paper, we characterize the nature of concern interactions an...

Preprocessing CVS Data for Fine-Grained Analysis

All analyses of version archives have one phase in common: the preprocessing of data. Preprocessing has a direct impact on the quality of the results returned by an analysis. In this paper we discuss four essential preprocessing tasks necessary for a fine-grained analysis of CVS archives: (a) data extraction, (b) transaction recovery, (c) mapping of changes to fine-grained entities, and (d) dat...

The Study and Analysis of Rice Agroclimatology in Lenjan

The west of Esfahan province, Iran, is one of the most important agricultural areas in the country due to its climate variability and the life-giving water of the Zayanderood river. Rice is one of the major economic crops in this area. The most important climatic elements to consider in agricultural activities include temperature, relative humidity, precipitation and wind. so...

Journal

Journal title: Empirical Software Engineering

Year: 2022

ISSN: 1382-3256, 1573-7616

DOI: https://doi.org/10.1007/s10664-021-10083-5